Quantum, 500 McCarthy Blvd., Milpitas, CA 95035


Sirocco Error Recovery Notes 


1.0 Sirocco Error Recovery Parameters 
and Recommended Settings 


1.1 Parameters: Allow Control. 


Sirocco Error Recovery has several MR parameters. Some of these parameters 
are accessible by the customer through the rd/wr configuration command, while 
others are controlled by config page settings. Together, these parameters allow 
Sirocco to individually enable/disable TA (Thermal Asperity) Recovery, ID 
Recovery, AM (Address Mark) Recovery, Read Bias Current variations, or 
disable Error Recovery completely. 


1.2 Customer Control: Too Much? 


Due to the above-mentioned level of customer control, it is possible for
customer test programs to disable MR recovery features that are critical to the
drive design. As figure 1.0 shows, Sirocco’s TA recovery “safety net” is made 
up of several levels of overlapping hardware and firmware solutions. Disabling 
the firmware solutions seriously impairs Sirocco’s ability to recover from MR- 
related phenomena, such as TA and Head Instability. Since TA and Instability 
retries are part of the normal retries and also rely on ECC, disabling retries or 
ECC at the drive-level is equivalent to disabling part of the drive. This leads to 
the dilemma of allowing customer control of features versus ensuring that 
needed firmware is not disabled by the customer. 


1.3 Solutions: Same End Result. 


One possible solution is to embed the MR-related retries into the firmware and 
not allow the user to disable these retries or ECC for MR errors. The other
solution is to request that customer test programs enable ECC and retries. In
either case, the motivation is the same: 


ECC and Retries are an integral part of Sirocco’s drive design for
MR-related errors.


1.4 Recommended Customer Settings.


With the above in mind, the recommended Error Recovery settings for Sirocco 
PMP drives are as follows: 


Minimum Customer Mode:


    Settings:         Comments:

    Retries:   0      Minimum Courtesy Retries exist
    ECC Span:  16     On-the-fly ECC needs to be enabled
    AWRE:      on     AWRE needed for write-fault errors or ID errors
                      on writes
    ARRE:      off
    ECC:       off    Courtesy 3-burst ECC exists for TAs


Full Customer Mode:


    Settings:         Comments:

    Retries:   8      Enable multiple retries
    ECC Span:  24     Enable full ECC power
    AWRE:      on     Allow Read and Write auto-reallocation
    ARRE:      on
    ECC:       on     Enable triple-burst FW ECC


In addition, the factory setting for MR_RECOV_PARMS (CP7, byte 9) is 37h to 
enable all MR recovery features. 


The Minimum Customer Mode setting should only be used for testing and to 
“stress” the drive. Under this setting, the drive will still perform minimal error 
recovery, including steps to recover from TA and Instability errors during both 
Read Operations and Write Operations. Any setting less than the minimum will 
impair TA and Instability recovery. The Full Customer Mode setting will ensure 
that enough recovery steps are invoked for most errors while not exceeding 
timeout conditions on the Host (15 seconds).


2.0 Counting TA Errors: TA Count


Sirocco offers the Host a way to measure the frequency of TA errors. The error
recovery firmware increments a variable, TA_COUNT, each time a TA error is
encountered. TA_COUNT increments for each recovered and each unrecovered TA
error. This variable increments from 0 and stops at 0FFFFh (it is not allowed to
roll over). TA_COUNT is not preserved at power down and re-initializes to 0
at each power up. The Host can read the current value of TA_COUNT by
issuing a “Read Quantum Configurations” command. The value is returned at
offsets 32h-33h. Typically, a Host test program reads this value at the
completion of a test to get a measure of the occurrences of TA errors. If the
Host performs retries, or scans each sector several times, TA_COUNT should
be divided appropriately to get the actual count of TA errors since the time of
power on.
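
As an illustration of the Host-side handling described above, the following Python sketch extracts the 16-bit count from a buffer returned by the “Read Quantum Configurations” command and normalizes it by the number of scan passes. The function names, the way the buffer is obtained, and the byte order (big-endian is assumed here) are assumptions for illustration only; they are not specified in these notes.

```python
# Hypothetical Host-side helpers; names and byte order are assumptions.
TA_COUNT_OFFSET = 0x32   # value is returned at offsets 32h-33h
TA_COUNT_MAX = 0xFFFF    # counter stops here and does not roll over


def read_ta_count(config_data: bytes, byteorder: str = "big") -> int:
    """Extract the 16-bit TA count from the returned configuration data.

    The byte order is an assumption; pass "little" if the drive returns
    the low byte first.
    """
    raw = config_data[TA_COUNT_OFFSET:TA_COUNT_OFFSET + 2]
    if len(raw) != 2:
        raise ValueError("configuration data too short to hold the TA count")
    return int.from_bytes(raw, byteorder)


def is_saturated(ta_count: int) -> bool:
    """True if the counter has hit its ceiling, so the real count may be higher."""
    return ta_count >= TA_COUNT_MAX


def ta_errors_per_pass(ta_count: int, passes: int) -> float:
    """Divide the raw count by the number of times each sector was scanned."""
    if passes <= 0:
        raise ValueError("passes must be a positive integer")
    return ta_count / passes
```

For example, a test that scans every sector three times and reads back a count of 300 would report 100 TA errors per pass. A count of 0FFFFh should be treated as a lower bound, since the counter stops incrementing at that value.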


3.0 Soft Error Rate Calculations and TA errors. 


The Sirocco program proposes that Recovered TA errors should not be
included in “Soft Error Rate” or “Recoverable Error Rate” calculations. The
reason is that TA errors are an inherent characteristic of MR drives.
Furthermore, Sirocco has been designed to recover from these errors. Since they
are inherent to the technology and are not “soft” random errors, recovered TA
errors should not be factored into the above calculations.
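
The proposed exclusion can be sketched as a simple calculation. The function below is a hypothetical illustration, not a Quantum-defined formula: the parameter names and the choice of "errors per bit read" as the rate unit are assumptions. Note that TA_COUNT counts both recovered and unrecovered TA errors, so a Host test program would have to separate out the recovered ones before using a value like this.

```python
def soft_error_rate(recovered_errors: int,
                    recovered_ta_errors: int,
                    bits_read: int) -> float:
    """Recoverable error rate with recovered TA errors excluded.

    All three inputs are assumed to be tallied by the Host test program.
    """
    if bits_read <= 0:
        raise ValueError("bits_read must be positive")
    if not 0 <= recovered_ta_errors <= recovered_errors:
        raise ValueError("recovered_ta_errors must be between 0 and recovered_errors")
    # Only the non-TA recovered errors count toward the soft error rate.
    return (recovered_errors - recovered_ta_errors) / bits_read
```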


4.0 Error Recovery Time


Sirocco error recovery time is normally quite fast. In rare cases, however, Error 
Recovery can take longer than normal due to “heroic” recovery steps to recover 
from ID errors or Address Mark (AM) errors. These steps are done as a last 
resort even though they require more time than normal because Sirocco places 
a higher priority on recovery of customer data. All this is achieved while 
minimizing Host time-out conditions caused by excessively long recovery time. 


4.1 Priorities 


Sirocco Error Recovery has been optimized to give the best performance 
possible. Performance is judged based on the following priorities: 


1. Data Integrity 
2. Data Recovery 
3. Recovery Time 


Data integrity remains the highest priority. Under no circumstances should 
Sirocco send bad data to the Host without indicating so. Data Recovery is the 
next priority. MR phenomena such as TA and Instability require Sirocco to take 
special approaches in dealing with these issues. Recovery Time is the last 
priority, but still a priority. Sirocco tries to do as much as possible to recover the 
customer data before giving up or causing the Host to time-out. 


4.2 Recovery Steps: Do Quick and Efficient Ones First. 


Under normal operating conditions, retries will not be necessary, even if there 
are “small” data errors since on-the-fly ECC is performed. If retries are 
necessary, then the most efficient recovery steps are performed first along with 
full-power firmware ECC correction. The early recovery steps include re-reads, 
off-track reads, and head-state-change (wiggle) recovery. These steps have 
been found to be effective for all types of errors and are faster than the “heroic” 
recovery steps. The time-consuming heroic steps, such as ID recovery and AM
recovery, are performed last.


4.3 Worst Case Recovery Time: Rarely Encountered. 


In the case of certain disastrous errors, it might be necessary to perform more 
than one of the time-consuming steps, such as ID recovery, followed by AM 
recovery and ECC. In these situations, the recovery process can take up to 1.5 
seconds for each retry. The throughput drops severely, but the data is still 
recovered. In the worst case, where the error spans the ID field and the Data AM
and exceeds the correction span of the ECC, the recovery process will exhaust
the retries permitted by the Host before reporting an unrecoverable error. For
example, if the Host sets the drive-level retry to 8, the recovery process can take
up to 12 seconds before reporting a failure. It should be noted, however, that
the above condition should rarely happen. As shown in Figure 1.0, the drive
design and defect-scanning should have mapped out these defects at the
factory. “Grown” defects should be less severe than the case above, and even
then, the firmware will auto-reallocate after recovering from the error so that
subsequent error recovery will not be required.
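
The worst-case figure quoted above is simple arithmetic: the retry count multiplied by the 1.5-second per-retry ceiling. The short Python sketch below reproduces it and compares the result against the 15-second Host timeout mentioned in section 1.4. The function and constant names are hypothetical, chosen only for this illustration.

```python
PER_RETRY_WORST_CASE_S = 1.5   # worst-case time for one retry, from section 4.3
HOST_TIMEOUT_S = 15.0          # Host timeout quoted in section 1.4


def worst_case_recovery_seconds(retry_count: int) -> float:
    """Upper bound on recovery time before an unrecoverable error is reported."""
    if retry_count < 0:
        raise ValueError("retry_count must not be negative")
    return retry_count * PER_RETRY_WORST_CASE_S


def fits_host_timeout(retry_count: int) -> bool:
    """True if the worst-case recovery time stays within the Host timeout."""
    return worst_case_recovery_seconds(retry_count) <= HOST_TIMEOUT_S
```

With the Full Customer Mode setting of 8 retries, this gives 8 x 1.5 = 12 seconds, which is inside the 15-second Host timeout.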


Error Recovery. The recovery steps are controlled by two parameters: Retry
Count and MR_RECOV_PARMS (Config Page 7, byte 9).


The following comments apply to both the Read Retry and Write Retry tables.


The recovery steps (1 through 11 for Reads, and 1 through 8 for Writes) are 
performed for each count of Total Retry. When in Super mode, Total Retry is 


The number of Courtesy Retries depends on the error and can vary from 0 to 7. 


If the Total Retry is 0, then all retries are disabled, regardless of the settings in
MR_RECOV_PARMS.


If Total Retry > 0, then retries are performed. As described in the tables above,
the setting of MR_RECOV_PARMS determines which MR recovery steps are
enabled. In addition, the value of Total Retry also determines whether Read
Bias Current Variations are enabled. For example, if Total Retry is 1, then
according to the tables above, the retry steps are only performed at 13 mA, the
optimum Read Bias Current. If Total Retry is 2, the retry steps are first done at
13 mA and, if still unsuccessful, then at 11.2 mA. Similarly, if Total Retry is 3,
then the retry steps are performed at 13 mA, 11.2 mA, and 14.8 mA if necessary.
If Total Retry is 4, then the Read Bias Current is varied in the sequence 13 mA,
11.2 mA, 14.8 mA, 13 mA. Thus, to enable Read Bias Current variations, both
Total Retry and MR_RECOV_PARMS are important.
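
The relationship between Total Retry and the Read Bias Current used on each pass can be summarized in a short Python sketch. It encodes only the four values given in the text above. The function name is hypothetical, the behavior for Total Retry values above 4 is not described here and is therefore rejected rather than guessed, and the gating by MR_RECOV_PARMS is not modeled.

```python
# Read Bias Current (mA) used on retry passes 1 through 4, as listed above.
READ_BIAS_SEQUENCE_MA = (13.0, 11.2, 14.8, 13.0)


def read_bias_schedule(total_retry: int) -> tuple:
    """Return the bias current, in mA, for each retry pass.

    Total Retry of 0 yields an empty schedule because all retries are disabled.
    """
    if total_retry < 0:
        raise ValueError("total_retry must not be negative")
    if total_retry > len(READ_BIAS_SEQUENCE_MA):
        # These notes only describe the sequence up to Total Retry = 4.
        raise ValueError("bias sequence beyond 4 retries is not described")
    return READ_BIAS_SEQUENCE_MA[:total_retry]
```

For instance, read_bias_schedule(3) returns (13.0, 11.2, 14.8), so the first pass with a non-optimum bias current only occurs once Total Retry is at least 2.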


Appendix A: Figure 1.0